Abstract

We prove the approximation ratio 8/5 for the metric $\{s,t\}$-path-TSP problem, and more
generally for shortest connected $T$-joins.

The algorithm that achieves this ratio is the simple "Best of Many" version of Christofides'
algorithm (1976), suggested by An, Kleinberg and Shmoys (2012), which consists in determining
the best Christofides $\{s,t\}$-tour out of those constructed from a family $\mathcal{F}_{>0}$ of trees having
a convex combination dominated by an optimal solution $x^*$ of the fractional relaxation. They
give the approximation guarantee $\frac{\sqrt{5}+1}{2}$ for such an $\{s,t\}$-tour, which is the first improvement
after the 5/3 guarantee of Hoogeveen's Christofides type algorithm (1991). Cheriyan,
Friggstad and Gao (2012) extended this result to a 13/8-approximation of shortest connected
$T$-joins, for $|T| \ge 4$.

The ratio 8/5 is proved by simplifying and improving the approach of An, Kleinberg and
Shmoys that consists in completing $x^*/2$ in order to dominate the cost of "parity correction"
for spanning trees. We partition the edge-set of each spanning tree in $\mathcal{F}_{>0}$ into an $\{s,t\}$-path
(or more generally, into a $T$-join) and its complement, which induces a decomposition of $x^*$.
This decomposition can be refined and then efficiently used to complete $x^*/2$ without using
linear programming or particular properties of $T$, but by adding to each cut deficient for $x^*/2$
an individually tailored explicitly given vector, inherent in $x^*$.

A simple example shows that the Best of Many Christofides algorithm may not find a
shorter $\{s,t\}$-tour than 3/2 times the incidentally common optima of the problem and of its
fractional relaxation.

keywords: traveling salesman problem, path TSP, approximation algorithm, T-join, polyhedron 

1 Introduction 

A Traveling Salesman wants to visit all vertices of a graph $G = (V,E)$, starting from his home
$s \in V$, and, since it is Friday, ending his tour at his week-end residence, $t \in V$. Given the
nonnegative valued length function $c : E \to \mathbb{R}_+$, he is looking for a shortest $\{s,t\}$-tour, that is,
one of smallest possible (total) length.

The Traveling Salesman Problem (TSP) is usually understood as the $s = t$ particular case
of the defined problem, where in addition every vertex is visited exactly once. This "minimum
length Hamiltonian circuit" problem is one of the main exhibited problems of combinatorial
optimization. Besides being NP-hard even for very special graphs or lengths [11], even the best
up to date methods of operations research and the most powerful computers programmed by the
brightest hackers fail to solve reasonable size problems exactly.

On the other hand, some implementations provide solutions only a few percent away from the 
optimum on some large "real-life" instances. A condition on the length function that certainly 
helps both in theory and practice is the triangle inequality. A nonnegative function on the edges 
that satisfies this inequality is called a metric function. The special case of the TSP where G is 
a complete graph and c is a metric is called the metric TSP. For a thoughtful and distracting 
account of the difficulties and successes of the TSP, see Bill Cook's book [5]. 

If c is not necessarily a metric function, the TSP is hopeless in general: it is not only NP-hard
to solve but also to approximate, and even for quite particular lengths, since the Hamiltonian
cycle problem in 3-regular graphs is NP-hard [11]. The practical context also makes it natural to
suppose that c is a metric.

A $\rho$-approximation algorithm for a minimization problem is a polynomial-time algorithm that
computes a solution of value at most $\rho$ times the optimum, where $\rho \in \mathbb{R}$, $\rho \ge 1$. The guarantee
or ratio of the approximation is $\rho$.

The first trace of allowing s and t to be different is Hoogeveen's article [15], providing a Christofides
type 5/3-approximation algorithm, again in the metric case. There had been no improvement until
An, Kleinberg and Shmoys [1] improved this ratio to $\frac{1+\sqrt{5}}{2} < 1.618034$ with a simple algorithm,
an ingenious new framework for the analysis, but a technically involved realization.

The algorithm first determines an optimum $x^*$ of the fractional relaxation; writing $x^*$ as a
convex combination of spanning trees and applying Christofides' heuristic for each, it outputs the
best of the arising tours. For the TSP problem $x^*/2$ dominates any possible parity correction,
as Wolsey [22] observed, but this is not true if $s \ne t$. However, [1] manages to perturb $x^*/2$,
differently for each spanning tree of the constructed convex combination, with small average
increase of the length.

We adopt this algorithm and this global framework for the analysis, and develop new tools 
that essentially change its realization and shortcut the most involved parts. This results in a 
simpler analysis guaranteeing a solution within 8/5 times the optimum. 

We did not require that the Traveling Salesman visit each vertex exactly once; our problem
statement requires only that every vertex is visited at least once. This version has been introduced
by Cornuéjols, Fonlupt and Naddef [6] and was called the graphical TSP. In other words, this
version asks for the "shortest spanning Eulerian subgraph", and puts forward an associated
polyhedron and its integrality properties, characterized in terms of excluded minors.

This version has many advantages: while the metric TSP is defined on the complete graph, the
graphical problem can be sparse, since an edge which is not a shortest path between its endpoints
can be deleted; however, it is equivalent to the metric TSP (see Tours below); the length function
c does not have to satisfy the triangle inequality; this version has an unweighted special case,
asking for the minimum size of a spanning Eulerian subgraph.

The term "graphic" or "graph-TSP" has eventually been taken by this all-1 special case, which
we do not investigate here; we avoid these three terms, which are used in a too diversified way,
different from habits for other problems. For comparison, let us only note the guaranteed ratios for the
cardinality versions of the problems: 3/2 for the min cardinality of a spanning connected subgraph
with two given odd degree vertices, and 7/5 if all vertices are of even degree [21].

2 Notation, Terminology and Preliminaries 

The set of non-negative real numbers is denoted by $\mathbb{R}_+$; $\mathbb{Q}$ denotes the set of rational numbers.
We fix the notation $G = (V, E)$ for the input graph. For $X \subseteq V$ we write $\delta(X)$ for the set of
edges with exactly one endpoint in $X$. If $w : E \to \mathbb{R}$ and $A \subseteq E$, then we use the standard
notation $w(A) := \sum_{e \in A} w(e)$.

Tours: For a graph $G = (V, E)$ and $T \subseteq V$ with $|T|$ even, a $T$-join in $G$ is a set $F \subseteq E$ such that
$T = \{v \in V : |\delta(v) \cap F| \text{ is odd}\}$. For $(G, T)$, where $G$ is connected, it is well-known and easy to
see that a $T$-join exists if and only if $|T|$ is even [17], [16]. A $T$-tour ($T \subseteq V$) of $G = (V,E)$ is a
set $F \subseteq 2E$ such that

(i) F is a T-join of 2G, 

(ii) (V, F) is a connected multigraph, 

where $2E$ is the multiset consisting of the edge-set $E$, and the multiplicity of each edge is 2; we
then denote $2G := (V, 2E)$. It is not false to think about $2G$ as $G$ with a parallel copy added to
each edge, but we find the multiset terminology better, since it allows for instance to keep the
length function and its notation $c : E \to \mathbb{R}_+$, or in the polyhedral descriptions to allow variables
to take the value 2 without increasing the number of variables; the length of a multi-subset will
be the sum of the lengths of the edges multiplied by their multiplicities, with obvious, unchanged
terms or notations: for instance the size of a multiset is the sum of its multiplicities; $\chi_A$ is the
multiplicity vector of $A$; $x(A)$ is the scalar product of $x$ with the multiplicity vector of $A$; a
subset of a multiset $A$ is a multiset with multiplicities smaller than or equal to the corresponding
multiplicities of $A$, etc.

A tour is a $T$-tour with $T = \emptyset$.
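As a minimal sketch (not part of the original text), the two defining conditions of a T-tour can be checked directly on a small instance. The function name `is_t_tour` and the representation of a multiset by an edge list with explicit multiplicities are assumptions made only for this illustration.

```python
from collections import Counter


def is_t_tour(vertices, edges, multiplicity, T):
    """Check whether a multiset F of edges is a T-tour.

    `edges` is a list of pairs (u, v); `multiplicity[i]` is the multiplicity
    of edges[i] in F. These names are illustrative, not from the paper.
    """
    if any(m not in (0, 1, 2) for m in multiplicity):
        return False  # F must be a sub-multiset of 2E
    # (i) F is a T-join of 2G: exactly the vertices of T have odd degree.
    degree = Counter()
    for (u, v), m in zip(edges, multiplicity):
        degree[u] += m
        degree[v] += m
    if {v for v in vertices if degree[v] % 2 == 1} != set(T):
        return False
    # (ii) (V, F) is connected: union-find over the edges actually used.
    parent = {v: v for v in vertices}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for (u, v), m in zip(edges, multiplicity):
        if m > 0:
            parent[find(u)] = find(v)
    return len({find(v) for v in vertices}) == 1
```

On the path s, a, t, for instance, taking both edges once gives an {s,t}-tour, while taking both edges twice gives a tour (T empty).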

When $(G, T)$ or $(G, T, c)$ are given, we always assume without repeating, that $G$ is a connected
graph, $|T|$ is even, and $c : E \to \mathbb{R}_+$. The latter will be called the length function, $c(A)$ ($A \subseteq E$)
is the length of $A$. The $T$-tour problem (TTP) is to minimize the length of a $T$-tour for $(G, T, c)$
as input. The subject of this work is the TTP for an arbitrary length function.

If $F \subseteq E$, we denote by $T_F$ the set of vertices incident to an odd number of edges in $F$; if $F$
is a spanning tree, $F(T)$ denotes the unique $T$-join of $F$.
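The two notations just introduced can be made concrete with a small sketch. It relies on the standard observation that a tree edge belongs to the unique T-join of the tree exactly when removing it leaves an odd number of vertices of T on each side. The function names `odd_vertices` and `tree_t_join` are hypothetical helpers, not notation from the paper.

```python
def odd_vertices(vertices, F):
    """T_F: the vertices incident to an odd number of edges of F."""
    degree = {v: 0 for v in vertices}
    for u, v in F:
        degree[u] += 1
        degree[v] += 1
    return {v for v in vertices if degree[v] % 2 == 1}


def tree_t_join(vertices, tree_edges, T):
    """F(T): the unique T-join inside a spanning tree F (|T| must be even)."""
    adjacent = {v: [] for v in vertices}
    for u, v in tree_edges:
        adjacent[u].append(v)
        adjacent[v].append(u)
    root = next(iter(vertices))
    parent = {root: None}
    order = [root]
    for v in order:  # breadth-first search; `order` grows while we iterate
        for w in adjacent[v]:
            if w not in parent:
                parent[w] = v
                order.append(w)
    # count[v] = number of vertices of T in the subtree hanging below v
    count = {v: (1 if v in T else 0) for v in vertices}
    join = set()
    for v in reversed(order):
        if parent[v] is not None:
            if count[v] % 2 == 1:  # odd side: the edge to the parent is used
                join.add(frozenset((parent[v], v)))
            count[parent[v]] += count[v]
    return join
```

For a path with endpoints s and t and T = {s, t}, the sketch returns the whole path, in line with the remark that F(T) is an {s,t}-path when |T| = 2.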






The sum of two (or more) multisets is a multiset whose multiplicities are the sums of the two
corresponding multiplicities. If $X, Y \subseteq E$, then $X + Y \subseteq 2E$ and $(V, X + Y)$ is a multigraph. Given
$(G,T)$, $F \subseteq E$ such that $(V, F)$ is connected, and a $T_F \triangle T$-join $J_F$, the multiset $F + J_F$ is a
$T$-tour; the notation "$\triangle$" stands for the symmetric difference (mod 2 sum of sets).

In [21] $T$-tours were introduced under the term connected $T$-joins. (This first name may be
confusing, since $T$-joins have only 0 or 1 multiplicities.) Even if the main target remains $|T| \le 2$,
the arguments concerning this case often lead out to problems with larger $T$.

By "Euler's theorem" a subgraph of 2G is a tour or {s,t}-tour if and only if its edges can be 
ordered to form a closed "walk" or a walk from s to t, that visits every vertex of G at least once, 
and uses every edge as many times as its multiplicity. 

For the TTP, a 2-approximation algorithm is trivial by taking a minimum cost spanning tree
$F$ and doubling the edges of a $T_F \triangle T$-join of $F$, that is, of $F(T_F \triangle T)$.

For $T = \emptyset$, Christofides [3] proposed determining first a minimum length spanning tree $F$
to assure connectivity, and then to add to it a shortest $T_F$-join. The obvious approximation
guarantee 3/2 of this algorithm has not been improved ever since. A Christofides type algorithm
for general $T$ adds a shortest $T_F \triangle T$-join instead.

For $T = \{s,t\}$ ($s,t \in V$) this has been proved to guarantee a ratio of 5/3 by Hoogeveen
[15] and improved by An, Kleinberg and Shmoys [1]. Hoogeveen's approach and ratio can be
obviously extended to $T$-tours for arbitrary $T$ providing the same guarantee with a Christofides
type algorithm and proof [21, Introduction]. In Section 3 we show an "even more Christofides
type" proof, relevant for our improved ratio 8/5 (see Proposition). Cheriyan, Friggstad and Gao
[4] provided the first ratio better than 5/3 for arbitrary $T$, by extending the analysis of [1], with
extra work, different for $|T| \ge 4$, leading to the ratio 13/8 = 1.625.

Minimizing the length of a tour or {s,t}-tour is equivalent to the metric TSP problem or its 
path version (with all degrees 2 except s and t of degree 1, that is, a shortest Hamiltonian circuit 
or path). Indeed, any length function of a connected graph can be replaced by a function on the 
complete graph with lengths equal to the lengths of shortest paths (metric completion): then a 
tour or an {s,t}-tour can be "shortcut" to a sequence of edges with all inner degrees equal to 2. 
Conversely, if in the metric completion we have a shortest Hamiltonian circuit or path we can 
replace the edges by paths and get a tour or {s,t}-tour. 
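The metric completion used in this equivalence can be computed by any all-pairs shortest path routine. The following is a minimal sketch using the Floyd-Warshall algorithm; the function name `metric_completion` and the encoding of edges as `frozenset({u, v})` keys are assumptions made for illustration.

```python
def metric_completion(vertices, length):
    """All-pairs shortest path lengths of a connected undirected graph.

    `length` maps frozenset({u, v}) to the nonnegative length of edge uv.
    The result d satisfies the triangle inequality on the complete graph.
    """
    infinity = float("inf")
    d = {(u, v): (0 if u == v else infinity) for u in vertices for v in vertices}
    for edge, c in length.items():
        u, v = tuple(edge)
        d[u, v] = d[v, u] = min(d[u, v], c)
    for k in vertices:  # Floyd-Warshall relaxation through vertex k
        for i in vertices:
            for j in vertices:
                if d[i, k] + d[k, j] < d[i, j]:
                    d[i, j] = d[i, k] + d[k, j]
    return d
```

An edge that is not a shortest path between its endpoints gets its length replaced by the shortest path length, which is why such edges can be deleted from the graphical problem without loss.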

Given $(G,T,c)$, the minimum length of a $T$-join in $G$ is denoted by $\tau(G,T,c)$. A $T$-cut is
a cut $\delta(X)$ such that $|X \cap T|$ is odd. It is easy to see that a $T$-join and a $T$-cut meet in an
odd number of edges. If in addition $c$ is integer, the maximum number of $T$-cuts so that every
edge $e$ is contained in at most $c(e)$ of them is denoted by $\nu(G,T,c)$. By a theorem of Edmonds and
Johnson [8], [17], $\tau(G, T, c) = \nu(G,T,2c)/2$, and a minimum length $T$-join can be determined in
polynomial time. These are useful for an intuition, even if we only use the weaker Theorem 2
below. For an introduction and more about different aspects of $T$-joins, see [17], [20], [9], [16].

Linear Relaxation: We adopt the polyhedral background and notations of [21].
Let $G = (V, E)$ be a graph. For a partition $\mathcal{W}$ of $V$ we introduce the notation

$$\delta(\mathcal{W}) := \bigcup_{W \in \mathcal{W}} \delta(W),$$

that is, $\delta(\mathcal{W})$ is the set of edges that have their two endpoints in different classes of $\mathcal{W}$.
Let $G$ be a connected graph, and $T \subseteq V$ with $|T|$ even.

$$
\begin{aligned}
P(G,T) := \{x \in \mathbb{R}^E :\ & x(\delta(W)) \ge 2 && \text{for all } \emptyset \ne W \subset V \text{ with } |W \cap T| \text{ even},\\
& x(\delta(\mathcal{W})) \ge |\mathcal{W}| - 1 && \text{for all partitions } \mathcal{W} \text{ of } V,\\
& 0 \le x(e) \le 2 && \text{for all } e \in E\}.
\end{aligned}
$$

Denote by $\mathrm{opt}(G,T,c)$ the length of the shortest $T$-tour for input $(G,T,c)$. Let $x^* \in P(G,T)$
minimize $c^\top x$ on $P(G,T)$.

Fact: Given $(G, T, c)$, $\mathrm{opt}(G, T, c) \ge \min_{x \in P(G,T)} c^\top x = c^\top x^*$.

Indeed, if $F$ is a $T$-tour, $\chi_F$ satisfies the defining inequalities of $P(G, T)$.

The following theorem is essentially the same as Schrijver [20, page 863, Corollary 50.8].

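Theorem 1 Let $x \in \mathbb{Q}^E$ satisfy the inequalities

$$
\begin{aligned}
x(\delta(\mathcal{W})) &\ge |\mathcal{W}| - 1 && \text{for all partitions } \mathcal{W} \text{ of } V,\\
0 \le x(e) &\le 2 && \text{for all } e \in E.
\end{aligned}
$$

Then there exists a set $\mathcal{F}_{>0}$, $|\mathcal{F}_{>0}| \le |E|$, of spanning trees and coefficients $\lambda_F \in \mathbb{R}$, $\lambda_F > 0$
($F \in \mathcal{F}_{>0}$) so that

$$\sum_{F \in \mathcal{F}_{>0}} \lambda_F = 1, \qquad x \ge \sum_{F \in \mathcal{F}_{>0}} \lambda_F \chi_F,$$

and for given $x$ as input, $\mathcal{F}_{>0}$, $\lambda_F$ ($F \in \mathcal{F}_{>0}$) can be computed in polynomial time.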

Proof: Let $x$ satisfy the given inequalities. If $(2 \ge)\, x(e) > 1$ ($e \in E$), introduce an edge $e'$
parallel to $e$, and define $x'(e') := x(e) - 1$, $x'(e) := 1$, and $x'(e) := x(e)$ if $x(e) \le 1$. Note that
the constraints are satisfied for $x'$, and $x' \le 1$. Apply Fulkerson's theorem [10] (see [20, page 863,
Corollary 50.8]) on the blocking polyhedron of spanning trees: $x'$ then dominates a convex combination
of spanning trees. Replacing $e'$ by $e$ in each spanning tree containing $e'$, and then applying
Carathéodory's theorem, we get the assertion. The statement on polynomial solvability follows
from Edmonds' matroid partition theorem [7], or the ellipsoid method [13]. □

Note that the inequalities in Theorem 1 form a subset of those that define $P(G, T)$. In
particular, any optimal solution $x^* \in P(G, T)$ for input $(G, T, c)$ satisfies the conditions of the
theorem. Fix $\mathcal{F}_{>0}$, $\lambda_F$ provided by the theorem for $x^*$, that is,

$$\sum_{F \in \mathcal{F}_{>0}} \lambda_F \chi_F \le x^*.$$

We fix the input $(G,T,c)$ and keep the definitions $x^*$, $\mathcal{F}_{>0}$, $\lambda_F$ until the end of the paper.

It would be possible to keep the context of [1] for $s \ne t$ where metrics in complete graphs
are kept and only Hamiltonian paths are considered (so the condition $x(\delta(v)) = 2$ if $v \ne s$,
$v \ne t$ is added), or the corresponding generalization in [4] for $T \ne \emptyset$. However, we find it more
comfortable to have in mind only $(G,T,c)$, where $c$ is the given function which is not necessarily
a metric, and $G$ is the original graph that is not necessarily the complete graph, and without
having a restriction on $T$. The paper can be read though with either definition in mind, the only
difference being the use of $\sum_{F \in \mathcal{F}_{>0}} \lambda_F \chi_F \le x^*$ without requiring equality, which is irrelevant here.

The reader can also substitute $T = \{s,t\}$ ($s,t \in V$ with $s = t$ allowed, meaning $T = \emptyset$) for
easier reading; none of the relevant features of the proofs will disappear.

Last, we state a well-known analogous theorem of Edmonds and Johnson for the blocking
polyhedron of $T'$-joins in the form we will use it. (The notation $T$ is now fixed for our input
$(G,T,c)$, and the theorem will be applied for several different $T'$ in the same graph.)

Theorem 2 [8] (cf. [17], [20]) Given $(G,T',c)$ ($T' \subseteq V$, $|T'|$ even, $c : E \to \mathbb{R}_+$), let

$$Q_+(G,T') := \{x \in \mathbb{R}^E : x(C) \ge 1 \text{ for each } T'\text{-cut } C,\ x(e) \ge 0 \text{ for all } e \in E\}.$$

A shortest $T'$-join can be found in polynomial time, and if $x \in Q_+(G,T')$, then $\tau(G,T',c) \le c^\top x$.



The guarantee of Christofides' algorithm for T-tours 

We finish the introduction to the $T$-tour problem with a proof of the 5/3-approximation ratio
for Christofides' algorithm. Watch the partition of the edges of a spanning tree into a $T$-join (if
$T = \{s,t\}$, an $\{s,t\}$-path) and the rest of the tree in this proof! For $\{s,t\}$-paths this ratio was
first proved by Hoogeveen [15] slightly differently (see for $T$-tours the Introduction of [21]),
and in [14] in a similar way, as pointed out to me by David Shmoys.

Proposition: Let $F$ be an arbitrary $c$-minimum spanning tree. Then $\tau(G, T_F \triangle T, c) \le \frac{2}{3}\,\mathrm{opt}(G, T, c)$.

Proof: $\{F(T), F \setminus F(T)\}$ is a partition of $F$ into a $T$-join and a $T \triangle T_F$-join (see Figure 1).
The shortest $T$-tour $K$ has a $T_F$-join $F'$ by connectivity, so $\{F', K \setminus F'\}$ is a partition of $K$ into a
$T_F$-join and a $T_F \triangle T$-join.

If either $c(F \setminus F(T)) \le \frac{2}{3} c(F)$ or $c(K \setminus F') \le \frac{2}{3} c(K)$, then we are done, since both are
$T \triangle T_F$-joins. If neither holds, then we use the $T \triangle T_F$-join $F(T) \triangle F'$. Since $c(F(T)) \le \frac{1}{3} c(F) \le
\frac{1}{3}\mathrm{opt}(G, T, c)$ and $c(F') \le \frac{1}{3} c(K) = \frac{1}{3}\mathrm{opt}(G, T, c)$, we have $c(F(T) \triangle F') \le c(F(T)) + c(F') \le
\frac{2}{3}\mathrm{opt}(G, T, c)$. □

In the next section we exploit this simple argument in a more advanced context (see Proposition
and its Corollary) that anticipates the proof of the main result.






3 Results 



In this section we introduce the "language" of the paper, random sampling, that has been proved
to be helpful for numerous problems. The ancestor of the method for the TSP can be viewed
to be Wolsey's proof [22] of $\mathrm{opt}(G, \emptyset, c) \le \frac{3}{2} c^\top x^*$, improved recently in the cardinality case by
[12], [18], [19], and for $T$-tours by [1], [4]. Our use of probabilities here is only notational though,
but an elegant notation does really help. In the second half of this section we state and prove the
key lemmas.

The random sampling framework has been used by An, Kleinberg and Shmoys for TSP paths
in a simple and original way with surprising success [1]. Readers familiar with [1] may find helpful
the explanations in Section 5 about the relation of the new results to this framework. In this
section watch the new ideas contributed by the present work: the separation of $x^*$ into $p^*$ and
$q^*$, and a further decomposition of $p^*$.

The coefficient $\lambda_F$ of each spanning tree $F \in \mathcal{F}_{>0}$ in the convex combination dominated by
$x^*$ (see Theorem 1) will be interpreted as a probability distribution of a random variable $\mathcal{F}$,

$$\Pr(\mathcal{F} = F) := \lambda_F,$$

whose values are spanning trees of $G$, and

$$\mathcal{F}_{>0} = \{F \subseteq E : F \text{ spanning tree of } G,\ \Pr(\mathcal{F} = F) > 0\}.$$

The notations for spanning trees will also be used for random variables whose values are
spanning trees. For instance $\mathcal{F}(T)$ denotes the random variable whose value is $F(T)$ precisely
when $\mathcal{F} = F$. Another example is $\chi_{\mathcal{F}}$, a random variable whose value is $\chi_F$ when $\mathcal{F} = F$.
Similarly, $T_{\mathcal{F}}$ is a random variable whose value for $\mathcal{F} = F$ is $T_F := \{v \in V : |\delta(v) \cap F| \text{ is odd}\}$.

We use now the probability notation for defining two vectors that will be extensively used:

$$p^*(e) := \Pr(e \in \mathcal{F}(T)); \qquad q^*(e) := \Pr(e \in \mathcal{F} \setminus \mathcal{F}(T)) \qquad (e \in E).$$

(These are short notations for the sum of $\lambda_F$ over the spanning trees $F \in \mathcal{F}_{>0}$ with $e \in F(T)$ or
$e \in F \setminus F(T)$, respectively.)

Fact: $E[\chi_{\mathcal{F}(T)}] = p^*$, $E[\chi_{\mathcal{F} \setminus \mathcal{F}(T)}] = q^*$, $E[\chi_{\mathcal{F}}] = p^* + q^* \le x^*$. Proof: Apply Theorem 1. □
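Since the probabilities are only a notation for weighted sums over the trees, the two vectors can be tabulated directly. The sketch below assumes that the family of trees, their coefficients and the T-join part of each tree are already given; the function name `split_vectors` and the string labels for edges are hypothetical choices for this illustration.

```python
from fractions import Fraction


def split_vectors(weighted_trees):
    """Compute (p, q) from a list of triples (lam, tree, t_join).

    `lam` is the coefficient of the tree, `tree` its edge set and `t_join`
    the subset of `tree` forming its unique T-join. p[e] sums lam over the
    trees having e in their T-join; q[e] sums lam over the trees having e
    outside their T-join.
    """
    p, q = {}, {}
    for lam, tree, t_join in weighted_trees:
        for e in tree:
            target = p if e in t_join else q
            target[e] = target.get(e, 0) + lam
    return p, q


# Example on the 4-cycle s, a, t, b with T = {s, t}: two spanning trees,
# each with coefficient 1/2, and their {s,t}-paths as T-joins.
half = Fraction(1, 2)
example_trees = [
    (half, {"sa", "at", "bt"}, {"sa", "at"}),
    (half, {"bs", "bt", "at"}, {"bs", "bt"}),
]
p_example, q_example = split_vectors(example_trees)
```

In this example the sum of the two vectors is dominated by the all-ones vector, as the Fact requires of the vector dominating the convex combination.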

Let us get familiar with the introduced vectors $p^*$, $q^*$ by sharpening the proposition at the end
of the preceding section using the minimum objective value of the fractional relaxation. This is
irrelevant for the proofs in the sequel, but shows the intuition of using $p^*$ and $q^*$.

Proposition: For each $T' \subseteq V$, $|T'|$ even, $\frac{1}{2}(x^* + p^*) \in Q_+(G,T')$.

Let $\mathcal{Q} := \{Q \text{ is a cut} : x^*(Q) < 2\}$. The assertion is that $p^*$ repairs the deficit of each $Q \in \mathcal{Q}$.

Proof: If $C$ is a cut, $C \notin \mathcal{Q}$, then $x^*(C) \ge 2$, so $\frac{1}{2}(x^*(C) + p^*(C)) \ge \frac{1}{2} x^*(C) \ge 1$. If $C \in \mathcal{Q}$:
$x^*(C) + p^*(C) \ge E[\chi_{\mathcal{F}}](C) + E[\chi_{\mathcal{F}(T)}](C) \ge 2$, since the event $|C \cap \mathcal{F}| = 1$ implies that the
unique edge of $C \cap \mathcal{F}$ is also contained in $\mathcal{F}(T)$. (The $T$-cut $C$ intersects every $T$-join.) □

Figure 1: One of many: $T_F \triangle T$-joins, in $F$ (left), minimum in $G$ (right), $J_F$; $T := \{s,t\}$.



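Corollary: $E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le \min\{c^\top x^* - \frac{c^\top q^*}{2},\ c^\top q^*\} \le \frac{2}{3} c^\top x^*$.

Proof: Apply the Proposition and Theorem 2 to get $\tau(G,T',c) \le c^\top \frac{1}{2}(x^* + p^*)$. Applying this
to $T' = T_F \triangle T$ ($F \in \mathcal{F}_{>0}$), and then substituting $\frac{1}{2}(x^* + p^*) \le x^* - \frac{1}{2} q^*$ (by the Fact), and finally
taking the mean value: $E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le c^\top x^* - \frac{c^\top q^*}{2}$.

On the other hand, since $\mathcal{F} \setminus \mathcal{F}(T)$ is a $T_{\mathcal{F}} \triangle T$-join, $E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le E[c(\mathcal{F} \setminus \mathcal{F}(T))] = c^\top q^*$.

The minimum of our two linear bounds takes its maximum value at $c^\top q^* = \frac{2}{3} c^\top x^*$. □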



While $c^\top q^*$ (the second upper bound of the corollary) is the mean value of the length of the
parity correcting $\mathcal{F} \setminus \mathcal{F}(T)$, $\frac{1}{2}(x^* + p^*)$ (of the first bound) is in $Q_+(G,T')$ for all $T' \subseteq V$, $|T'|$
even. This "for all" is a superfluous luxury! Indeed, it is not very economic to add $p^*$ for all
$F \in \mathcal{F}_{>0}$, when a smaller vector, adapted to $F$ (see below) is enough!

The reader may find it helpful to have a look at Figure 1 for these remarks, for the following
algorithm and for the subsequent arguments and theorem.



Best of Many Christofides Algorithm [1]: Input $(G,T,c)$.

Determine $x^*$ [13] using [2], see [21]. (Recall: $x^*$ is an optimal solution of $\min_{x \in P(G,T)} c^\top x$.)

Determine $\mathcal{F}_{>0}$. (See Theorem 1 and its proof.)

Determine the best parity correction for each $F \in \mathcal{F}_{>0}$, i.e. a shortest $T_F \triangle T$-join $J_F$ [8], [16].
Output that $F + J_F$ ($F \in \mathcal{F}_{>0}$) for which $c(F + J_F)$ is minimum.
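The last two steps of the algorithm can be sketched on tiny instances as follows. This is only an illustration under stated assumptions: the family of trees is taken as given (computing $x^*$ and the trees requires linear programming and is not shown), and the length of a shortest parity-correcting join is obtained by brute force, using the standard fact that for nonnegative lengths it equals the cheapest pairing of the odd vertices by shortest paths. All function names are hypothetical.

```python
from functools import lru_cache


def shortest_path_lengths(vertices, length):
    """Floyd-Warshall; `length` maps frozenset({u, v}) to the edge length."""
    d = {(u, v): (0 if u == v else float("inf")) for u in vertices for v in vertices}
    for edge, c in length.items():
        u, v = tuple(edge)
        d[u, v] = d[v, u] = min(d[u, v], c)
    for k in vertices:
        for i in vertices:
            for j in vertices:
                d[i, j] = min(d[i, j], d[i, k] + d[k, j])
    return d


def min_join_length(dist, odd):
    """Length of a shortest T'-join for T' = odd (brute force over pairings)."""

    @lru_cache(maxsize=None)
    def best(rest):
        if not rest:
            return 0
        first, others = rest[0], rest[1:]
        return min(
            dist[first, v] + best(tuple(w for w in others if w != v))
            for v in others
        )

    return best(tuple(sorted(odd)))


def best_of_many(vertices, length, T, trees):
    """Return (total length, tree) minimizing c(F) plus parity correction."""
    dist = shortest_path_lengths(vertices, length)
    best = None
    for F in trees:
        degree = {v: 0 for v in vertices}
        for edge in F:
            for v in edge:
                degree[v] += 1
        T_F = {v for v in vertices if degree[v] % 2 == 1}
        # parity correction: a shortest (T_F symmetric-difference T)-join
        total = sum(length[e] for e in F) + min_join_length(dist, T_F ^ set(T))
        if best is None or total < best[0]:
            best = (total, F)
    return best
```

The brute-force pairing is exponential in the number of odd vertices, so a real implementation would replace it by a polynomial-time minimum weight perfect matching routine.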



When $T = \emptyset$ ($s = t$) Wolsey [22] observed that $x^*/2 \in Q_+(G,T)$ and then by Theorem 2
parity correction costs at most $c^\top x^*/2$, so Christofides' tour is at most 3/2 times $c^\top x^*$; in [1],
[4] and here this analysis is refined for paths and in general for $T$-tours.

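Define

$$R := \min_{F \in \mathcal{F}_{>0}} \frac{c(F) + \tau(G, T_F \triangle T, c)}{c^\top x^*} \le \frac{E[c(\mathcal{F}) + \tau(G, T_{\mathcal{F}} \triangle T, c)]}{c^\top x^*} \le 1 + \frac{E[\tau(G, T_{\mathcal{F}} \triangle T, c)]}{c^\top x^*}.$$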

Ratios of tour lengths versus $c^\top x^*$ may be better than $R$, since Christofides' way of choosing a
spanning tree and adding parity correction is not the only way for constructing tours. For instance
Mömke and Svensson [18] get better results for some problems by starting from larger
graphs than trees and deleting some edges instead of adding them for parity correction. However,
here we are starting with trees and correct their parity by adding edges for deducing the ratio
$R \le 8/5$ through the following theorem, the main result of the paper:

Theorem 3 $E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le \frac{3}{5} c^\top x^*$

Recall $\mathcal{Q} := \{Q \text{ is a cut} : x^*(Q) < 2\}$. Every $Q \in \mathcal{Q}$ is a $T$-cut, since non-$T$-cuts $C$ are required
to have $x(C) \ge 2$ in the definition of $P(G,T)$. In [1] it is proved that the vertex-sets defining $\mathcal{Q}$
form a chain if $|T| = 2$; in [4] they are proved to form a laminar family for general $T$. We do not
use these properties, but we need the following simple but crucial observation from [1]:

Lemma 4 If $C$ is a cut, then $\Pr(|C \cap \mathcal{F}| \ge 2) \le x^*(C) - 1$, $\Pr(|C \cap \mathcal{F}| = 1) \ge 2 - x^*(C)$.
Moreover if $C \in \mathcal{Q}$, then the event $|C \cap \mathcal{F}| = 1$ implies that $C$ is not a $T_{\mathcal{F}} \triangle T$-cut.

Proof: If $C$ is a cut of $G$, $x^*(C) \ge E[|C \cap \mathcal{F}|] \ge \Pr(|C \cap \mathcal{F}| = 1) + 2\Pr(|C \cap \mathcal{F}| \ge 2)$, where
$\Pr(|C \cap \mathcal{F}| = 1) + \Pr(|C \cap \mathcal{F}| \ge 2) = 1$, so the inequalities follow for an arbitrary cut. The last
statement also follows, since $C \in \mathcal{Q}$ implies that $C$ is a $T$-cut, and on the event $|C \cap \mathcal{F}| = 1$ it is
also a $T_{\mathcal{F}}$-cut (by degree counting), so it is not a $T_{\mathcal{F}} \triangle T$-cut, as claimed. □

It is for cuts $C \in \mathcal{Q}$ that this lemma provides relevant information. An, Kleinberg and
Shmoys [1] need and prove more about $\mathcal{Q}$; their main technical tool [1, Lemma 3] is actually a
linear programming fact about this family which is the more difficult half of their proof. Cheriyan,
Friggstad, Gao [4] generalize these properties. The following two lemmas provide a natural simple
alternative to this approach, inherent in the problem:

Lemma 5 If $C_1 \ne C_2$ are cuts of $G$, $e \in E$, then the events $\{e\} = C_1 \cap \mathcal{F}$ and $\{e\} = C_2 \cap \mathcal{F}$ are
disjoint, and if they are $T$-cuts, these events are included in the event $e \in \mathcal{F}(T)$.

The statement is true for arbitrary cuts $C_1$, $C_2$, but it will be applied only for $C_1, C_2 \in \mathcal{Q}$.

Proof: Indeed, $\{e\} = C_1 \cap F$ for some $F \in \mathcal{F}_{>0}$ means that $e$ is the unique edge of $F$ in $C_1$,
so $C_1$ is the set of edges of $G$ joining the two components of $F \setminus \{e\}$. If $C_1 \ne C_2$, then the event
that $F \setminus \{e\}$ defines $C_1$ and the event that it defines $C_2$ mutually exclude one another.

Moreover, if say $C_1$ is a $T$-cut, then it has a common edge with every $T$-join, so in the event
$\{e\} = C_1 \cap \mathcal{F}$ we have $e \in \mathcal{F}(T)$, proving the last statement. □

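For all $Q \in \mathcal{Q}$ and $e \in E$ define $x^Q(e) := \Pr(\{e\} = Q \cap \mathcal{F})$. In linear terms $x^Q \in \mathbb{R}^E$ is
equivalently defined as

$$x^Q := \sum_{F \in \mathcal{F}_{>0},\ |Q \cap F| = 1} \lambda_F \chi_{Q \cap F}.$$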

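Lemma 6 Outside $Q$, $x^Q$ is 0. Moreover, $\mathbf{1}^\top x^Q = x^Q(Q) \ge 2 - x^*(Q)$, and

$$\sum_{Q \in \mathcal{Q}} x^Q \le p^*.$$

Proof: If $e \notin Q$, then $e \notin Q \cap F$ for all $F \in \mathcal{F}_{>0}$, so $x^Q(e) := \Pr(\{e\} = Q \cap \mathcal{F}) = 0$. Now

$$\mathbf{1}^\top x^Q := \sum_{F \in \mathcal{F}_{>0},\ |Q \cap F| = 1} \lambda_F \mathbf{1}^\top \chi_{Q \cap F} = \sum_{F \in \mathcal{F}_{>0},\ |Q \cap F| = 1} \lambda_F \cdot 1 = \Pr(|Q \cap \mathcal{F}| = 1),$$

so the first inequality follows now from Lemma 4. To see the second inequality note that for each
$e \in E$,

$$\sum_{Q \in \mathcal{Q}} x^Q(e) = \sum_{Q \in \mathcal{Q}} \ \sum_{F \in \mathcal{F}_{>0},\ Q \cap F = \{e\}} \lambda_F = \sum_{Q \in \mathcal{Q}} \Pr(Q \cap \mathcal{F} = \{e\}),$$

and by Lemma 5 this is at most $\Pr(e \in \mathcal{F}(T)) = p^*(e)$. □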
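The statements of Lemma 6 can be checked by hand on a toy instance, and the following sketch does this mechanically. Everything in it is an illustrative assumption: the instance (the complete graph on s, a, b, t without the edge st, with T = {s, t}), the fractional vector `x` (half the sum of the two Hamiltonian s-t paths, not claimed to be optimal for any length function), and the helper names. Cuts are enumerated by brute force, so this only works for very small graphs.

```python
from fractions import Fraction
from itertools import combinations


def cut(edges, X):
    """delta(X): the edges (frozensets) with exactly one endpoint in X."""
    return frozenset(e for e in edges if len(e & X) == 1)


def narrow_cuts(vertices, x):
    """All cuts Q with x(Q) < 2, found by brute force over vertex subsets."""
    rest = vertices[1:]  # keep vertices[0] outside X so no cut is listed twice
    found = set()
    for size in range(1, len(rest) + 1):
        for X in combinations(rest, size):
            Q = cut(x.keys(), frozenset(X))
            if sum(x[e] for e in Q) < 2:
                found.add(Q)
    return found


def x_Q(Q, weighted_trees):
    """x^Q(e): total coefficient of the trees F with Q ∩ F = {e}."""
    vec = {}
    for lam, tree in weighted_trees:
        common = [e for e in tree if e in Q]
        if len(common) == 1:
            vec[common[0]] = vec.get(common[0], 0) + lam
    return vec


def E(u, v):
    return frozenset((u, v))


half = Fraction(1, 2)
x = {E("s", "a"): half, E("s", "b"): half, E("a", "b"): Fraction(1),
     E("a", "t"): half, E("b", "t"): half}
trees = [(half, [E("s", "a"), E("a", "b"), E("b", "t")]),
         (half, [E("s", "b"), E("a", "b"), E("a", "t")])]
# Both trees are s-t paths, so each tree is its own T-join and p equals x here.
p = dict(x)
cuts = narrow_cuts(["s", "a", "b", "t"], x)
```

In this instance the only cuts of value below 2 are the two single-vertex cuts at s and at t, and the vectors attached to them sum to 1/2 on each of the four edges incident to s or t, which is dominated by p.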

4 Proof 

In this section we prove the promised approximation ratio (Theorem 3). As in [1], we want to
complete the random variable $\beta x^* + (1 - 2\beta)\chi_{\mathcal{F}}$, $1/3 < \beta < 1/2$, to one that is in $Q_+(G, T_{\mathcal{F}} \triangle T)$,
by adding a random variable. The length expectation of what we get then is an upper bound
for the price $\tau(G, T_{\mathcal{F}} \triangle T, c)$ of parity correction, by Theorem 2. The difficulty is to estimate the
length expectation of the added random variable in terms of $c^\top x^*$.

Why just the form $\beta x^* + (1 - 2\beta)\chi_{\mathcal{F}}$? We follow [1] here: for all cuts $C \notin \mathcal{Q}$, that is, if
$x^*(C) \ge 2$, we have then $\beta x^*(C) + (1 - 2\beta)\chi_{\mathcal{F}}(C) \ge 2\beta + 1 - 2\beta = 1$. By this choice it is sufficient
to add correcting vectors to $T_{\mathcal{F}} \triangle T$-cuts in $\mathcal{Q}$, and we do not know of any alternative for this.

Why just in the interval $1/3 < \beta < 1/2$? We need $1 - 2\beta \ge 0$; $\beta < 1/3$ would make the
approximation ratio at least 5/3.

For any cut $C$ we call the random variable $\max\{0, 1 - (\beta x^*(C) + (1 - 2\beta)|C \cap \mathcal{F}|)\}$ the deficit
of $C$ for $\beta$, unless $C \in \mathcal{Q}$, $|C \cap \mathcal{F}| = 1$, when we define the deficit to be 0 (see Lemma 4).

Lemma 7 The deficit of a $T_{\mathcal{F}} \triangle T$-cut $C$ for $\beta$ ($\beta \in (1/3, 1/2)$) is constantly 0, unless $C \in \mathcal{Q}$
and $|C \cap \mathcal{F}| \ge 2$, and when it is positive, it is never larger than $4\beta - 1 - \beta x^*(C)$.

Note that this value can be negative, but then the deficit of $C$ is constantly 0.

Proof: If $C \notin \mathcal{Q}$, then $x^*(C) \ge 2$, and we saw three paragraphs above that the deficit of $C$ for
$\beta$ is 0. If $C \in \mathcal{Q}$ then $C$ is a $T$-cut; if in addition $|C \cap \mathcal{F}| = 1$, then $C$ is also a $T_{\mathcal{F}}$-cut, so it is
not a $T_{\mathcal{F}} \triangle T$-cut (Lemma 4), and the deficit is defined to be 0.

We proved: if $C$ is a $T_{\mathcal{F}} \triangle T$-cut and the deficit of $C$ for $\beta$ is not 0, then $|C \cap \mathcal{F}| \ge 2$.
Substituting this inequality into the deficit: $1 - (\beta x^*(C) + (1 - 2\beta)|C \cap \mathcal{F}|) \le 4\beta - 1 - \beta x^*(C)$. □

Let $f^Q(\beta) := \max\{0, \frac{4\beta - 1 - \beta x^*(Q)}{2 - x^*(Q)}\}$, and $s^F(\beta) := \sum_{Q \in \mathcal{Q},\ |Q \cap F| \ge 2} f^Q(\beta)\, x^Q$.

Figure 2: The approximation guarantee cannot be improved below 3/2. This example is essentially the same as
the more complicated one in [21, Figure 3] providing the same lower bound for a more powerful algorithm in the
cardinality case. $|V| = 2k$, $\mathrm{opt}(G,T,\mathbf{1}) = c^\top x^* = 2k - 1$ (left). Best of Many Christofides output (right): $3k - 2$
if $\mathcal{F}_{>0}$ consists of the thick (red) tree and its central symmetric image. There are more potential spanning trees
for $\mathcal{F}_{>0}$, but $\tau(G, T_F \triangle T, \mathbf{1}) \ge k - 2$ for each, so $c(F + J_F) \ge 3k - 3$ for each, and with any $T_F \triangle T$-join $J_F$.



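Lemma 8 $\beta x^* + (1 - 2\beta)\chi_{\mathcal{F}} + s^{\mathcal{F}}(\beta) \in Q_+(G, T_{\mathcal{F}} \triangle T)$ is the sure event for all $\beta \in (1/3, 1/2)$.

Proof: By Lemma 6, $x^Q(Q) \ge 2 - x^*(Q)$, so $f^Q(\beta) x^Q(Q) \ge 4\beta - 1 - \beta x^*(Q)$ by substituting
the above definition of $f^Q(\beta)$. On the other hand, by Lemma 7 the deficit of a $T_{\mathcal{F}} \triangle T$-cut, if
positive at all, is at most $4\beta - 1 - \beta x^*(Q)$. □

Theorem 3 $E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le \frac{3}{5} c^\top x^*$.

Figure 2 shows that this bound cannot be decreased below $\frac{1}{2} c^\top x^*$.

Proof: Fix $\beta$, $1/3 < \beta < 1/2$.

Claim 1: $E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le (1 - \beta) c^\top x^* + c^\top E[s^{\mathcal{F}}(\beta)]$ for all $1/3 < \beta < 1/2$.

By Lemma 8, $\beta x^* + (1 - 2\beta)\chi_{\mathcal{F}} + s^{\mathcal{F}}(\beta) \in Q_+(G, T_{\mathcal{F}} \triangle T)$ is the sure event, so $\tau(G, T_{\mathcal{F}} \triangle T, c) \le
c^\top(\beta x^* + (1 - 2\beta)\chi_{\mathcal{F}} + s^{\mathcal{F}}(\beta))$ also always holds. Taking the expectation of both sides and applying
$E[c^\top(\beta x^* + (1 - 2\beta)\chi_{\mathcal{F}})] \le (1 - \beta) c^\top x^*$ (Fact of Section 3), the Claim is proved.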

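Claim 2: For each $Q \in \mathcal{Q}$, $\Pr(|Q \cap \mathcal{F}| \ge 2)\, f^Q(\beta) \le \frac{\beta\omega(3 - \frac{1}{\beta} - \omega)}{1 - \omega}$, where $0 \le \omega = 1 - \sqrt{\frac{1}{\beta} - 2} \le 1$.

By Lemma 4, $\Pr(|Q \cap \mathcal{F}| \ge 2)\, f^Q(\beta) \le (x^*(Q) - 1)\, f^Q(\beta) \le \max\, (x^*(Q) - 1)\, \frac{4\beta - 1 - \beta x^*(Q)}{2 - x^*(Q)}$.

Substitute $\omega := x^*(Q) - 1$. Then the quantity to maximize becomes the function of $\omega$ in the
claim. This function takes its maximum at the given value of $\omega$, and if $1/3 < \beta < 1/2$ then
$0 < \omega < 1$, proving the Claim.

To be concise, denote $f(\beta) := \frac{\beta\omega(3 - \frac{1}{\beta} - \omega)}{1 - \omega}$, where $\omega = 1 - \sqrt{\frac{1}{\beta} - 2}$.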
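The maximization behind Claim 2 can be spelled out as a short calculus check. The abbreviations $a := 3 - 1/\beta$, $g$ and $s$ below are introduced only for this sketch and are not notation of the paper.

```latex
% Sketch under the assumption 1/3 < \beta < 1/2, so that 0 < a < 1.
% Here a := 3 - 1/\beta and g is the function to maximize on [0,1).
\[
  g(\omega) := \frac{\beta\,\omega\,(a-\omega)}{1-\omega},
  \qquad
  g'(\omega) = \beta\,\frac{\omega^{2}-2\omega+a}{(1-\omega)^{2}} .
\]
% The root of the numerator lying in [0,1) is
\[
  \omega = 1-\sqrt{1-a} = 1-\sqrt{\tfrac{1}{\beta}-2}.
\]
% With s := \sqrt{1/\beta - 2} = 1-\omega we have a = 1 - s^2 and
% a-\omega = s(1-s), hence the value at the maximizer simplifies to
\[
  f(\beta) = g(\omega) = \beta\,(1-s)^{2} = \beta\,\omega^{2}.
\]
```

As a consistency check, $\beta = 4/9$ gives $\omega = 1/2$ and $f(\beta) = \frac{4}{9} \cdot \frac{1}{4} = \frac{1}{9}$.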



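Claim 3: $E[s^{\mathcal{F}}(\beta)] \le f(\beta)\, p^*$.

$$E[s^{\mathcal{F}}(\beta)] = \sum_{F \in \mathcal{F}_{>0}} \Pr(\mathcal{F} = F) \sum_{Q \in \mathcal{Q},\ |Q \cap F| \ge 2} f^Q(\beta)\, x^Q = \sum_{Q \in \mathcal{Q}} \Pr(|Q \cap \mathcal{F}| \ge 2)\, f^Q(\beta)\, x^Q \le f(\beta) \sum_{Q \in \mathcal{Q}} x^Q,$$

by Claim 2. Finally, substituting $\sum_{Q \in \mathcal{Q}} x^Q \le p^*$ (Lemma 6) we get the claim.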

Now we are ready to finish the proof of the theorem. By Claim 1 and Claim 3, we have:

$$E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le (1 - \beta)\, c^\top x^* + f(\beta)\, c^\top p^*,$$

where for all $\varepsilon \in \mathbb{R}$, $\varepsilon \ge 0$, either $c^\top p^* \le (\frac{1}{2} - \varepsilon)\, c^\top x^*$, or $c^\top q^* \le (\frac{1}{2} + \varepsilon)\, c^\top x^*$, because $p^* + q^* \le x^*$
(Fact of Section 3). So, using the Fact again, if the latter case holds we have:

$$E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le E[c(\mathcal{F} \setminus \mathcal{F}(T))] = c^\top q^* \le (\tfrac{1}{2} + \varepsilon)\, c^\top x^*,$$

and if the first case holds we can substitute $c^\top p^* \le (\frac{1}{2} - \varepsilon)\, c^\top x^*$ into the result we got before:

$$E[\tau(G, T_{\mathcal{F}} \triangle T, c)] \le (1 - \beta)\, c^\top x^* + (\tfrac{1}{2} - \varepsilon) f(\beta)\, c^\top x^*.$$

We got two upper bounds for $E[\tau(G, T_{\mathcal{F}} \triangle T, c)]$, both having, for any fixed $\beta$, linear functions of $\varepsilon$
as coefficients of $c^\top x^*$. The minimum of the two functions has its maximum at $\varepsilon = \frac{1}{2} - \frac{\beta}{f(\beta) + 1}$,
which, as a function of $\beta$, has a unique minimum at $\beta = 4/9$ (and then $\omega = 1/2$, $f(\beta) = 1/9$),
with minimum value $\varepsilon = 1/10$. □
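The final optimization can be double-checked numerically. The sketch below evaluates the function f and the balancing value of ε on a grid over the open interval (1/3, 1/2); the grid search and the function names are choices made only for this illustration, not the method used in the paper.

```python
import math


def omega(beta):
    """Place of the maximum in Claim 2: 1 - sqrt(1/beta - 2)."""
    return 1 - math.sqrt(1 / beta - 2)


def f(beta):
    """f(beta) = beta * w * (3 - 1/beta - w) / (1 - w) at w = omega(beta)."""
    w = omega(beta)
    return beta * w * (3 - 1 / beta - w) / (1 - w)


def epsilon(beta):
    """Value of epsilon at which the two upper bounds coincide."""
    return 0.5 - beta / (f(beta) + 1)


def minimize_epsilon(steps=20000):
    """Grid search for the beta in (1/3, 1/2) minimizing epsilon(beta)."""
    low, high = 1 / 3, 1 / 2
    grid = (low + (high - low) * i / steps for i in range(1, steps))
    best_beta = min(grid, key=epsilon)
    return best_beta, epsilon(best_beta)
```

Running `minimize_epsilon()` should return a value of β close to 4/9 and a value of ε close to 1/10, so that the resulting guarantee 1 + 1/2 + ε is close to 8/5.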



5 Connections 

Finally, we explain the connection of the results to their immediate predecessor, to some variants 
and to some open questions. 

5.1 First, we explain the content of this work in terms of An, Kleinberg and Shmoys [1]:

Replace the vectors provided by [1, Lemma 3], whose existence is proved with linear-programming
and network flow methods, by the vector $x^Q$, $x^Q(e) := \Pr(\{e\} = Q \cap \mathcal{F})$, see just above Lemma 6.
The Lemma provides alternative simple properties for $x^Q$ that turn out to be more advantageous
than those of the vectors of [1, Lemma 3], and moreover are easy to prove.

The result of this change is that the maximum possible deficit $\beta\omega(\tau - \omega)$ of $T'$-cuts for a
tentative "$T'$-join dominator", where $\omega = \tau/2$ (the place of the maximum) in [1], is replaced by
$\frac{\beta\omega(\tau - \omega)}{1 - \omega}$, where $\omega = 1 - \sqrt{\frac{1}{\beta} - 2}$ (the new place of the maximum), see Claim 2 of the proof.

Another advantage is due to the fact that the new vectors sum up to a vector of smaller length than
$c^\top x^*$: actually to at most $c^\top x^*/2$, and are in fact dominated by $p^*$ (Lemma 6), where $c^\top p^* \le
(\frac{1}{2} - \varepsilon)\, c^\top x^*$ unless $c^\top q^* \le (\frac{1}{2} + \varepsilon)\, c^\top x^*$ (Fact in Section 3).

Despite these advantages, I cannot compare the vectors of [1, Lemma 3] and $x^Q$ directly. Therefore it seemed
reasonable to hope that combining the two may further improve the bound 8/5. Figure 3 is the
Wolfram Alpha output showing that this is not the case.

If the coefficient of the sum of the fy_ is y - this is the only single number that determines the 
extent of acting as pQ did -, our formulas in Section[4]are revised as follows. In Lemma[7]the upper 
bound becomes 4/3 — 1 — f3x*(Q) — y and then replacing Claim 1 and redoing Claim 3 accordingly 
(cf. the conclusion of these in the two lines following the proof of Claim 3), furthermore replacing 
f Q (P), /(/?) by the two-variable functions f Q (P,y), f(P,y): 

$$E[\tau(G, T_{\mathcal{F}} \Delta T, c)] \le (1 - \beta + y)\, c^T x^* + c^T E[s^{\mathcal{F}}(\beta)] \le (1 - \beta + y)\, c^T x^* + f(\beta, y)\, c^T p^*,$$

[Figure 3: Wolfram Alpha output; the second variable is the coefficient of the vector defined in [1, Lemma 3], and the guaranteed ratio is 3/2 + the minimum.]

Figure 3: Mixing the performance of our analysis with that of An, Kleinberg, Shmoys [1], optimally.

I was not able to exclude by hand that the minimum of this two-variable function for $\varepsilon$ could
be smaller than 1/10. Thanks to Louis Esperet and Nicolas Catusse for a pointer and a first
guidance to Wolfram Alpha, which provided the answer of Figure 3, and to Sebastian Pokutta, who
has double-checked the computations with Mathematica. Of course, besides $f^*_U$ and $x^Q$ there
may be many other vectors to combine, and other possibilities for improvement.

5.2 The results of the paper have obvious corollaries according to reductions of variants of the 
TSP to the TSP path problem, as a black box, for instance: 

For the clustered traveling salesman problem [14], in which vertices of pairwise disjoint sets
have to be visited consecutively, the update for the performance guarantee, where the number of
clusters is a constant, is 8/5; substituting our results into [1], we get that the prize-collecting s-t
path TSP problem is 1.94837-approximable.

5.3 Some of the questions that arise may be easier than the famous questions of the field:
Could the results of [21], 3/2-approximating minimum size $T$-tours or 7/5-approximating tours,
be reached with the Best of Many Christofides algorithm? Could the methods make the so far
rigid bound of 3/2 move down, at least for shortest 2-edge-connected multigraphs?

Acknowledgment 

Many thanks to the organizers and participants of the Cargèse Workshop of Combinatorial Optimization
devoted to the TSP problem, for their time and interest, furthermore to Corinna, Jens, Kenjiro, Marcin
and Zoli Szigeti for their comments on this manuscript. I am highly indebted to Joseph Cheriyan, Zoli
Király and David Shmoys for their prompt and pertinent opinions before my presentation, and to R. Ravi
and Attila Bernáth for their continuous interest and wise suggestions.

Thanks are also due to an anonymous pickpocket for a free day I could spend at Orly Airport, and 
to Easyjet for a delayed flight followed by a night I could spend at Saint Exupéry Airport. This research
began, thanks to their accidental, but helpful, day and night contributions. 






References 



[1] An, H.-C., Kleinberg, R., and Shmoys, D.B., Improving Christofides' algorithm for the s-t path TSP.
Proceedings of the 44th Annual ACM Symposium on Theory of Computing (2012), to appear

[2] Barahona, F., and Conforti, M., A construction for binary matroids. Discrete Mathematics 66 (1987),
213-218

[3] Christofides, N., Worst-case analysis of a new heuristic for the traveling salesman problem. Technical
Report 388, Graduate School of Industrial Administration, Carnegie-Mellon University, Pittsburgh
(1976)

[4] Cheriyan, J., Friggstad, Z., and Gao, Z., Approximating Minimum-Cost Connected T-Joins,
arXiv:1207.5722v1 [cs.DS] (2012)

[5] Cook, W.J., In Pursuit of the Traveling Salesman: Mathematics at the Limits of Computation.
Princeton University Press 2012

[6] Cornuéjols, G., Fonlupt, J., and Naddef, D., The traveling salesman problem on a graph and some
related integer polyhedra. Mathematical Programming 33 (1985), 1-27

[7] Edmonds, J., Submodular functions, matroids and certain polyhedra. In: Combinatorial Structures
and Their Applications; Proceedings of the Calgary International Conference on Combinatorial
Structures and Their Applications 1969 (R. Guy, H. Hanani, N. Sauer, J. Schönheim, eds.), Gordon and
Breach, New York 1970, pp. 69-87

[8] Edmonds, J., and Johnson, E.L., Matching, Euler tours and the Chinese postman. Mathematical
Programming 5 (1973), 88-124

[9] Frank, A., Connections in Combinatorial Optimization. Oxford University Press 2011

[10] Fulkerson, D.R., Blocking Polyhedra. In: Graph Theory and Its Applications (Proceedings Advanced
Seminar Madison, Wisconsin, 1969; B. Harris, ed.), Academic Press, New York 1970, pp. 93-112

[11] Garey, M.R., Johnson, D.S., and Tarjan, R.E., The planar Hamiltonian circuit problem is
NP-complete. SIAM Journal on Computing 5 (1976), 704-714

[12] Gharan, S.O., Saberi, A., and Singh, M., A randomized rounding approach to the traveling salesman
problem. Proceedings of the 52nd Annual IEEE Symposium on Foundations of Computer Science
(2011), 550-559

[13] Grötschel, M., Lovász, L., and Schrijver, A., The ellipsoid method and its consequences in
combinatorial optimization. Combinatorica 1 (2) (1981), 169-197

[14] Guttmann-Beck, N., Hassin, R., Khuller, S., and Raghavachari, B., Approximation Algorithms with
Bounded Performance Guarantees for the Clustered Traveling Salesman Problem. Algorithmica 28
(2000), 422-437

[15] Hoogeveen, J.A., Analysis of Christofides' heuristic: some paths are more difficult than cycles.
Operations Research Letters 10 (5) (1991), 291-295

[16] Korte, B., and Vygen, J., Combinatorial Optimization. Springer 2012, Fifth Edition

[17] Lovász, L., and Plummer, M.D., Matching Theory. Akadémiai Kiadó, Budapest 1986, and
North-Holland, Amsterdam 1986

[18] Mömke, T., and Svensson, O., Approximating graphic TSP by matchings. Proceedings of the 52nd
Annual Symposium on Foundations of Computer Science (2011), 560-569

[19] Mucha, M., 13/9-approximation for graphic TSP. Proceedings of the 29th International Symposium on
Theoretical Aspects of Computer Science (2012), 30-41

[20] Schrijver, A., Combinatorial Optimization. Springer 2003

[21] Sebő, A., and Vygen, J., Shorter Tours by Nicer Ears: 7/5-approximation for graphic TSP, 3/2 for
the path version, and 4/3 for two-edge-connected subgraphs, arXiv:1201.1870v3 [cs.DM] (2012)

[22] Wolsey, L.A., Heuristic analysis, linear programming and branch and bound. Mathematical
Programming Study 13 (1980), 121-134



